Length bias in Encoder Decoder Models and a Case for Global Conditioning
Authors
Abstract
Encoder-decoder networks are popular for modeling sequences probabilistically in many applications. These models use the power of the Long Short-Term Memory (LSTM) architecture to capture the full dependence among variables, unlike earlier models like CRFs that typically assumed conditional independence among non-adjacent variables. However, in practice, encoder-decoder models exhibit a bias towards short sequences that, surprisingly, gets worse with increasing beam size. In this paper we show that this phenomenon is due to a discrepancy between the full-sequence margin and the per-element margin enforced by the locally conditioned training objective of an encoder-decoder model. The discrepancy more adversely impacts long sequences, explaining the bias towards predicting short sequences. For the case where the predicted sequences come from a closed set, we show that a globally conditioned model alleviates the above problems of encoder-decoder models. From a practical point of view, our proposed model also eliminates the need for beam search during inference, which reduces to an efficient dot-product based search in a vector space.
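To make the practical claim concrete, here is a minimal illustrative sketch (not the authors' implementation) of what "inference reduces to a dot-product based search" means when the output sequences come from a closed set: each candidate sequence gets a precomputed embedding, and prediction is a single nearest-neighbor lookup by dot product against the input encoding, with no step-by-step beam search. All names, the toy candidate set, and the random stand-in encoder below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8  # embedding dimension (assumed for the sketch)
candidates = ["short seq", "a much longer sequence", "another option"]

# Precomputed embeddings for every sequence in the closed set
# (random stand-ins for a trained model's sequence embeddings).
candidate_embeddings = rng.normal(size=(len(candidates), d))

def encode(inputs: str) -> np.ndarray:
    """Stand-in for the encoder: deterministically maps a string to a vector."""
    seed = sum(ord(c) for c in inputs)
    return np.random.default_rng(seed).normal(size=d)

def predict(inputs: str) -> str:
    """Globally conditioned inference: one dot-product search over the
    closed candidate set replaces beam search entirely."""
    scores = candidate_embeddings @ encode(inputs)  # one score per full sequence
    return candidates[int(np.argmax(scores))]

print(predict("some input"))
```

Because each candidate is scored as a whole sequence, the per-step length bias of locally conditioned decoding cannot arise in this search.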
Similar resources
Decoding Coattention Encodings for Question Answering
An encoder-decoder architecture with recurrent neural networks in both the encoder and decoder is a standard approach to the question-answering problem (finding answers to a given question in a piece of text). The Dynamic Coattention [1] encoder is a highly effective encoder for the problem; we evaluated the effectiveness of different decoders when paired with the Dynamic Coattention encoder. We ...
Controlling Output Length in Neural Encoder-Decoders
Neural encoder-decoder models have shown great success in many sequence generation tasks. However, previous work has not investigated situations in which we would like to control the length of encoder-decoder outputs. This capability is crucial for applications such as text summarization, in which we have to generate concise summaries with a desired length. In this paper, we propose methods for...
Incorporating Structural Alignment Biases into an Attentional Neural Translation Model
Neural encoder-decoder models of machine translation have achieved impressive results, rivalling traditional translation models. However, their modelling formulation is overly simplistic and omits several key inductive biases built into traditional models. In this paper we extend the attentional neural translation model to include structural biases from word-based alignment models, including po...
Link Prediction using Network Embedding based on Global Similarity
Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the history of previous link connections and combining it with available information. Local link prediction approaches based on node structure are fast but not accurate enough. On the other hand, global link predicti...
Digital surface model extraction with high details using single high resolution satellite image and SRTM global DEM based on deep learning
The digital surface model (DSM) is an important product in the field of photogrammetry and remote sensing and has a variety of applications in this field. Existing techniques require more than one image for DSM extraction, and this paper investigates and analyzes the possibility of DSM extraction from a single satellite image. In this regard, an algorithm based on deep convolutional...
Journal:
Volume / Issue:
Pages: -
Publication year: 2016